1 Example with fish data from NORS
In this example we are interested in exploring data from a specific data resource – Sjöprovfiskedatabasen NORS (Institutionen för akvatiska resurser, SLU). This data base has 2.8 M observations starting in the 1950’s.
As you may already know, SBDI is a collection of many biodiversity databases. We start by searching for the data resource we are interested in using the function pick_filter(). This is an interactive query guiding you through the many resources available to filtering your query (data resources, spatial layers, and curated species lists).
fq_str <- pick_filter("resource")
## follow the instructions Follow the instruction. Your choices here would have been “in3” –> “dr20.” Your variable fq_str will now contain a string “data_resource_uid:dr20.”
## we cant do it interactive here so we force it
fq_str <- "data_resource_uid:dr20"But we are not interested in the complete database, but on the last 10 years of data. for this we concatenate (add to a vector) another filter string. These will be treated as AND factors.
y1 <- 2012
y2 <- 2021
fq_str <- c(fq_str, paste0("year:[", y1, " TO ", y2,"]"))
## Note the square brackets are hard limitsFor references on how to use the filters see documentation at docs.
Using the function occurrences() we can the query for the observations fulfilling our filter. If you haven’t specified that in the sbdi_config() before (see [Case 1: Example with fish data from NORS] Installation), you need to pass your email and the download reason.
library(SBDI4R)
xf <- occurrences(fq = fq_str,
email = "test@test.org",
download_reason_id=10)
## Simply summarise all records by data source
table(xf$data$dataResourceName)
#>
#>
#> 29
#> SLU Aqua Institute of Freshwater Research National register of survey test-fishing - NORS
#> 1875081.1 Plotting data on a map
You can quickly plot all the observations as a PDF file with the function ocurrence_plot(), one page per species:
occurrences_plot(x, "obsPlot.pdf",
grouped=FALSE,
taxon_level="species",
pch='+')Note that the plot is saved to a pdf file in the current working directory. You can find that with getwd().
1.1.0.1 Leaflet
There are many other ways of producing spatial plots in R. The leaflet package provides a simple method of producing browser-based maps with panning, zooming, and background layers:
library(leaflet)
# drop any records with missing lat/lon values
xfl <- xf$data[!is.na(xf$data$longitude) | !is.na(xf$data$latitude),]
marker_colour <- rep("#d95f02", nrow(xfl))
# blank map, with imagery background
leaflet(width = "100%") %>%
addProviderTiles("Esri.WorldImagery") %>%
# add markers
addCircleMarkers(xfl$longitude, xfl$latitude,
radius = 1,
fillOpacity =.5,
opacity = 1,
col=marker_colour,
clusterOptions = markerClusterOptions())1.2 Temporal analysis
A quick summary over the years reveal a drop in number of records over time, and stops .
table(xf$data$year)
#>
#> 2012 2013 2014 2015 2016 2017 2018 2019
#> 69685 68470 35715 4425 2900 2019 2417 1906hist(xf$data$year,
breaks = seq(y1, y2),
xlab = "Year",
main = "")
1.3 Species summary
In the same way we can summaries the number of observations for each species, by common or scientific name.
sppTab <- table(xf$data$commonName)
sppDF <- as.data.frame(sppTab)
colnames(sppDF)[1] <- "species"
sppDF
#> species Freq
#> 1 19481
#> 2 Alaska fourhorn sculpin 470
#> 3 Alpine bullhead 614
#> 4 American burbot 7982
#> 5 Aral asp 717
#> 6 Arctic char 4827
#> 7 Baltic vimba 304
#> 8 bleak 11355
#> 9 Blue bream 502
#> 10 brown trout 3827
#> 11 Bullhead 91
#> 12 carps 1161
#> 13 common carp, carp 20
#> 14 Common dace 357
#> 15 crucian carp 695
#> 16 eel 79
#> 17 European pike-perch 6791
#> 18 European whitefish 8816
#> 19 grayling 706
#> 20 gudgeon 238
#> 21 ide, orfe 1194
#> 22 Miller's thumbs 361
#> 23 minnow 1975
#> 24 perch 28520
#> 25 pike 21319
#> 26 Pope 15933
#> 27 rainbow trout 94
#> 28 roach 24811
#> 29 rudd 6485
#> 30 salmon 35
#> 31 smelt 7118
#> 32 spined loach 850
#> 33 sprat 40
#> 34 ten-spined stickleback, nine-spined stickleback 128
#> 35 tench 4492
#> 36 white bream 5149sppTab <- table(xf$data$scientificName)
sppDF <- as.data.frame(sppTab)
colnames(sppDF)[1] <- "species"
sppDF
#> species Freq
#> 1 Abramis brama (Linnaeus, 1758) 11616
#> 2 Alburnus alburnus (Linnaeus, 1758) 11355
#> 3 Anguilla anguilla (Linnaeus, 1758) 79
#> 4 Ballerus ballerus (Linnaeus, 1758) 502
#> 5 Blicca bjoerkna (Linnaeus, 1758) 5149
#> 6 Carassius carassius (Linnaeus, 1758) 695
#> 7 Cobitis taenia Linnaeus, 1758 850
#> 8 Coregonus albula albula 7865
#> 9 Coregonus maraena (Bloch, 1779) 8816
#> 10 Cottus gobio Linnaeus, 1758 91
#> 11 Cottus Linnaeus, 1758 361
#> 12 Cottus poecilopus Heckel, 1840 614
#> 13 Cyprinidae 1161
#> 14 Cyprinus carpio Linnaeus, 1758 20
#> 15 Esox lucius Linnaeus, 1758 21319
#> 16 Gobio gobio (Linnaeus, 1758) 238
#> 17 Gymnocephalus cernua (Linnaeus, 1758) 15933
#> 18 Leuciscus aspius (Linnaeus, 1758) 717
#> 19 Leuciscus idus (Linnaeus, 1758) 1194
#> 20 Leuciscus leuciscus (Linnaeus, 1758) 357
#> 21 Lota lota (Linnaeus, 1758) 7982
#> 22 Myoxocephalus quadricornis (Linnaeus, 1758) 470
#> 23 Oncorhynchus mykiss (Walbaum, 1792) 94
#> 24 Osmerus eperlanus (Linnaeus, 1758) 7118
#> 25 Perca fluviatilis Linnaeus, 1758 28520
#> 26 Phoxinus phoxinus (Linnaeus, 1758) 1975
#> 27 Pungitius pungitius (Linnaeus, 1758) 128
#> 28 Rutilus rutilus (Linnaeus, 1758) 24811
#> 29 Salmo salar Linnaeus, 1758 35
#> 30 Salmo trutta Linnaeus, 1758 3827
#> 31 Salvelinus alpinus (Linnaeus, 1758) 4827
#> 32 Sander lucioperca (Linnaeus, 1758) 6791
#> 33 Scardinius erythrophthalmus (Linnaeus, 1758) 6485
#> 34 Sprattus sprattus (Linnaeus, 1758) 40
#> 35 Thymallus thymallus (Linnaeus, 1758) 706
#> 36 Tinca tinca (Linnaeus, 1758) 4492
#> 37 Vimba vimba (Linnaeus, 1758) 304Perhaps, you need to send this table as a .CSV file to a colleague.
write.csv(sppDF, "NORS species summary.csv")
# NOTE: again this will be saved on your working directory1.4 Spatial biodiversity analysis
Let’s now ask: how does the species richness vary across Sweden? In this case we want to summarise occurrences species-wise over a defined grid instead of plotting every observation point. First we need to overlay the observations with a grid. In this case, the standard Swedish grids at 50, 25, 10 and 5 km are provided as data in the SBDI4R package (with Coordinate Reference System = WGS84, EPSG:4326).
library(sp) # the function coordinates() and proj4string() are in sp
library(rgeos) # the function over() is in package rgeos
# load some shapes over Sweden's political borders
data("swe_wgs84", package="SBDI4R", envir=environment())
# A standard 50km grid
data("Sweden_Grid_50km_Wgs84", package="SBDI4R", envir=environment())
grid <- Sweden_Grid_50km_Wgs84
# grid <- spTransform(grid, CRS("+init=epsg:4326")) # as mentioned above, the grid already has has the same CRS, but changes are undergoing in the sp package and this step is needed
# make the observations spatial
# NOTE: make sure there are no NAs on either column defining the coordinates
# xf$data[!is.na(xf$data$longitude) | !is.na(xf$data$latitude),]
obs <- as.data.frame(xf$data)
coordinates(obs) <- obs[,c("longitude","latitude")]
wkt <- sf::st_crs(4326)[[2]]
proj4string(obs) <- sp::CRS(wkt) #CRS("+init=epsg:4326")
nObs <- nrow(obs)
# overlay the data with the grid
ObsInGridList <- over(grid, obs, returnList=TRUE)
wNonEmpty <- unname( which( unlist(lapply(ObsInGridList, nrow)) != 0) )
if(length(wNonEmpty)==0) message("Observations don't overlap any grid cell.")The result ObsInGridList is a list object with a subset of the data on each grid. Now summarise occurrences within grid cells:
# check n the total number of observations
sum(unlist(lapply(ObsInGridList, nrow)))
#> [1] 187537
# apply a summary over the grid cells
nCells <- length(ObsInGridList)
res <- data.frame("nObs"=as.numeric(rep(NA,nCells)),
"nYears"=as.numeric(rep(NA,nCells)),
"nSpp"=as.numeric(rep(NA,nCells)),
row.names = row.names(grid),
stringsAsFactors = FALSE)
cols2use <- c("scientificName", "year")
dataRes <- lapply(ObsInGridList[wNonEmpty], function(x){
x <- x[,cols2use]
colnames(x) <- c("scientificName", "year")
return(c("nObs" = length(x[,"scientificName"]),
"nYears" = length(unique(x[,"year"])),
"nSpp" = length(unique(x[,"scientificName"]))
))
})
dataRes <- as.data.frame(dplyr::bind_rows(dataRes, .id = "gridID"))
res[wNonEmpty,] <- dataRes[,-1]
resSp <- sp::SpatialPolygonsDataFrame(grid, res)And finally plot the grid summary as a map:
palBW <- leaflet::colorNumeric(c("white", "navyblue"),
c(0, max(resSp@data$nSpp, na.rm = TRUE)),
na.color = "transparent")
oldpar <- par()
par(mar = c(1,1,0,0))
plot(resSp, col=palBW(resSp@data$nSpp), border = NA)
plot(swe_wgs84$Border, border=1, lwd=1, add=T)
legend("bottomleft",
legend = round(seq(0, max(resSp@data$nSpp, na.rm = TRUE), length.out = 5)),
col = palBW(seq(0, max(resSp@data$nSpp, na.rm = TRUE), length.out = 5)),
title = "Number of \nspecies", pch = 15, bty="n")
suppressWarnings(par(oldpar))